Regularized Structured Output Learning with Partial Labels
Abstract
We consider the problem of learning structured output probabilistic models from training examples with partial labels. Partial label scenarios arise commonly in web applications such as taxonomy (hierarchical) classification, multi-label classification, and information extraction from web pages. For example, in a taxonomy classification problem, label information may be available only at the internal node level (not at the leaf level) for some pages. In a multi-label classification problem, it may be available only for some of the classes in each example. Similarly, in a sequence learning problem, we may have label information only for some nodes in the training sequences. Conventionally, marginal likelihood maximization has been used to solve these problems. Such a solution does not use unlabeled examples or side information about the unlabeled part, such as the expected label distribution (or label correlation in a multi-label setting). We solve these problems by incorporating entropy and label distribution or correlation regularizations along with marginal likelihood. Entropy and label distribution regularizations have been used previously in semi-supervised learning with fully unlabeled examples. In this paper we develop probabilistic taxonomy and multi-label classifier models, and provide the ideas needed for extending their use to the partial label scenario. Experiments on real-life taxonomy and multi-label learning problems show that incorporating these regularizations yields significant improvements in accuracy when most of the examples are only partially labeled.
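To make the combination of terms in the abstract concrete, the following is a minimal sketch of a regularized objective for the multi-label case. It is not the paper's model: it assumes independent logistic outputs per class (so marginalizing over missing labels simply drops them), a KL divergence from the expected label distribution to the mean prediction as the label distribution regularizer, and hypothetical names throughout (`regularized_objective`, `observed`, `prior`, `lam_ent`, `lam_dist`).

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def binary_entropy(p, eps=1e-12):
    # Entropy of a Bernoulli(p) variable, computed elementwise.
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def regularized_objective(W, X, Y, observed, prior,
                          lam_ent=0.1, lam_dist=0.1, eps=1e-12):
    """Loss to minimize under an ASSUMED independent-logistic model.

    X: (n, d) features, W: (d, k) weights, Y: (n, k) labels in {0, 1}.
    observed: (n, k) boolean mask, True where the label is known.
    prior: (k,) expected fraction of positives per class (side information).
    """
    P = np.clip(sigmoid(X @ W), eps, 1.0 - eps)

    # Marginal likelihood: with independent labels, summing out the
    # unobserved labels leaves only the observed terms.
    log_lik = np.where(Y == 1, np.log(P), np.log(1.0 - P))
    nll = -np.sum(log_lik[observed])

    hidden = ~observed

    # Entropy regularizer: favor confident predictions on unobserved labels.
    entropy = np.sum(binary_entropy(P)[hidden])

    # Label distribution regularizer (assumed form): per-class KL divergence
    # from the expected prior to the mean prediction over unobserved entries.
    counts = hidden.sum(axis=0)
    has_hidden = counts > 0
    q = (P * hidden).sum(axis=0) / np.maximum(counts, 1)
    q = np.clip(q, eps, 1.0 - eps)
    pr = np.clip(prior, eps, 1.0 - eps)
    kl = pr * np.log(pr / q) + (1.0 - pr) * np.log((1.0 - pr) / (1.0 - q))

    return nll + lam_ent * entropy + lam_dist * np.sum(kl[has_hidden])
```

Setting `lam_ent` and `lam_dist` to zero recovers plain marginal likelihood maximization, the conventional baseline the abstract describes. A taxonomy or sequence model would need a proper sum over the unobserved part of the structure instead of the masking used here.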
Similar resources
Consistency of structured output learning with missing labels
In this paper we study statistical consistency of partial losses suitable for learning structured output predictors from examples containing missing labels. We provide sufficient conditions on data generating distribution which admit to prove that the expected risk of the structured predictor learned by minimizing the partial loss converges to the optimal Bayes risk defined by an associated com...
Structured Prediction by Conditional Risk Minimization
We propose a general approach for supervised learning with structured output spaces, such as combinatorial and polyhedral sets, that is based on minimizing estimated conditional risk functions. Given a loss function defined over pairs of output labels, we first estimate the conditional risk function by solving a (possibly infinite) collection of regularized least squares problems. A prediction ...
Multilabel Classification through Structured Output Learning - Methods and Applications
Aalto University, P.O. Box 11000, FI-00076 Aalto, www.aalto.fi. Author: Hongyu Su. Name of the doctoral dissertation: Multilabel Classification through Structured Output Learning Methods and Applications. Publisher: School of Science. Unit: Department of Computer Science. Series: Aalto University publication series DOCTORAL DISSERTATIONS 28/2015. Field of research: Information and Computer Science. Manuscrip...
Regularized Multi-Concept MIL for weakly-supervised facial behavior categorization
In this work, we address the problem of estimating high-level semantic labels for videos of recorded people by means of analysing their facial expressions. This problem, to which we refer as facial behavior categorization, is a weakly-supervised learning problem where we do not have access to frame-by-frame facial gesture annotations but only weak-labels at the video level are available. Theref...
Structured Output Learning with Candidate Labels for Local Parts
This paper introduces a special setting of weakly supervised structured output learning, where the training data is a set of structured instances and supervision involves candidate labels for some local parts of the structure. We show that the learning problem with this weak supervision setting can be efficiently handled and then propose a large margin formulation. To solve the non-convex optim...
Publication date: 2012